[VL] Add Velox batch resizer copyRanges fast path #12101
Open
zhli1142015 wants to merge 1 commit into
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a default-enabled VeloxBatchResizer fast path that collects small dense batches, allocates the output RowVector once, and bulk-copies child vector ranges with copyRanges. The config remains available as an opt-out switch.
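The difference between the append baseline and the fast path described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Column`, `Batch`, `mergeByAppend`, and `mergeByRangeCopy` are hypothetical names, plain `std::vector<int>` columns stand in for Velox child vectors, and `std::copy` stands in for `copyRanges`. It does not use the Velox API and is not the code in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for one fixed-width child vector (hypothetical, not a Velox type).
using Column = std::vector<int>;

// Toy stand-in for a RowVector: every column holds the same number of rows.
struct Batch {
  std::vector<Column> columns;
  std::size_t numRows() const {
    return columns.empty() ? 0 : columns[0].size();
  }
};

// Baseline model: append each small batch as it arrives, so each output
// column may grow (and reallocate) several times.
Batch mergeByAppend(const std::vector<Batch>& inputs) {
  Batch out;
  for (const Batch& in : inputs) {
    if (out.columns.empty()) {
      out.columns.resize(in.columns.size());
    }
    assert(in.columns.size() == out.columns.size());
    for (std::size_t c = 0; c < in.columns.size(); ++c) {
      out.columns[c].insert(
          out.columns[c].end(), in.columns[c].begin(), in.columns[c].end());
    }
  }
  return out;
}

// Fast-path model: compute the total row count, size the output once, then
// copy one contiguous range per (input batch, column) pair.
Batch mergeByRangeCopy(const std::vector<Batch>& inputs) {
  Batch out;
  if (inputs.empty()) {
    return out;
  }
  std::size_t totalRows = 0;
  for (const Batch& in : inputs) {
    totalRows += in.numRows();
  }
  const std::size_t numColumns = inputs[0].columns.size();
  out.columns.assign(numColumns, Column(totalRows));

  std::size_t targetRow = 0;
  for (const Batch& in : inputs) {
    assert(in.columns.size() == numColumns);
    for (std::size_t c = 0; c < numColumns; ++c) {
      std::copy(
          in.columns[c].begin(),
          in.columns[c].end(),
          out.columns[c].begin() + static_cast<std::ptrdiff_t>(targetRow));
    }
    targetRow += in.numRows();
  }
  return out;
}
```

Both functions produce the same merged batch; the second only changes how often the output is allocated and how the rows are moved, which is the property the fast path relies on.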
Wire the flag through Scala, Java, and JNI. Add C++ tests covering both the fast path and the fallback path, plus a test for the config default. Add dense-vector benchmark scenarios that compare the append opt-out baseline, the default copyRanges path, direct child copyRanges, a reader-side raw payload bulk-copy model, and a pre-merged flush model.
Benchmark results from velox_batch_resizer_benchmark (CPU time; ASLR was enabled, so the numbers may be noisy):
| Scenario | Append opt-out baseline (us) | Default copyRanges (us) | Direct child copyRanges (us) | Raw bulk-copy model (us) |
| --- | --- | --- | --- | --- |
| Mixed_64x64 | 95.1 | 19.7 | 17.4 | 33.3 |
| Mixed_16x256 | 33.7 | 6.4 | 5.0 | 10.5 |
| Mixed_256x16 | 217.7 | 50.4 | 28.6 | 112.6 |
| Fixed2_64x64 | 26.6 | 5.5 | 2.0 | 13.7 |
| Fixed16_64x64 | 121.6 | 27.0 | 17.4 | 92.9 |
| LongString_64x64 | 31.7 | 7.1 | 4.5 | 15.3 |
| BoolHeavy_64x64 | 68.7 | 10.9 | 5.4 | 37.7 |
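To make the results above easier to compare, the speedup of the default copyRanges path over the append opt-out baseline can be computed as baseline time divided by copyRanges time. The sketch below only redoes that arithmetic on the numbers reported in this description; `Result`, `speedup`, `reportedResults`, and `printSpeedups` are hypothetical helper names, not part of the benchmark binary.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// One benchmark row: CPU time in microseconds for the two paths compared.
struct Result {
  std::string scenario;
  double appendUs;      // append opt-out baseline
  double copyRangesUs;  // default copyRanges path
};

// Speedup of the default copyRanges path relative to the append baseline.
double speedup(const Result& r) {
  assert(r.copyRangesUs > 0.0);
  return r.appendUs / r.copyRangesUs;
}

// Numbers copied from the benchmark results listed in this description.
std::vector<Result> reportedResults() {
  return {
      {"Mixed_64x64", 95.1, 19.7},
      {"Mixed_16x256", 33.7, 6.4},
      {"Mixed_256x16", 217.7, 50.4},
      {"Fixed2_64x64", 26.6, 5.5},
      {"Fixed16_64x64", 121.6, 27.0},
      {"LongString_64x64", 31.7, 7.1},
      {"BoolHeavy_64x64", 68.7, 10.9},
  };
}

// Prints one line per scenario with the computed ratio.
void printSpeedups() {
  for (const Result& r : reportedResults()) {
    std::printf("%-18s %.2fx\n", r.scenario.c_str(), speedup(r));
  }
}
```

On these numbers the ratio falls between roughly 4.3x (Mixed_256x16) and 6.3x (BoolHeavy_64x64). Given the noise caveat above, the ratios are best read as approximate.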